Federated Causal Inference from Multi-Site Observational Data via Propensity Score Aggregation

Khellaf, Rémi, Bellet, Aurélien, Josse, Julie

arXiv.org Artificial Intelligence

Causal inference typically assumes centralized access to individual-level data. Yet, in practice, data are often decentralized across multiple sites, making centralization infeasible due to privacy, logistical, or legal constraints. We address this problem by estimating the Average Treatment Effect (ATE) from decentralized observational data via a Federated Learning (FL) approach, allowing inference through the exchange of aggregate statistics rather than individual-level data. We propose a novel method to estimate propensity scores by computing a federated weighted average of local scores with Membership Weights (MW)--probabilities of site membership conditional on covariates--which can be flexibly estimated using parametric or non-parametric classification models. Unlike density ratio weights (DW) from the transportability and generalization literature, which either rely on strong modeling assumptions or cannot be implemented in FL, MW can be estimated using standard FL algorithms and are more robust, as they support flexible, non-parametric models--making them the preferred choice in multi-site settings with strict data-sharing constraints. The resulting propensity scores are used to construct Federated Inverse Propensity Weighting (Fed-IPW) and Augmented IPW (Fed-AIPW) estimators. Unlike meta-analysis methods, which fail when any site violates positivity, our approach leverages heterogeneity in treatment assignment across sites to improve overlap. We show that Fed-IPW and Fed-AIPW perform well under site-level heterogeneity in sample sizes, treatment mechanisms, and covariate distributions. Both theoretical analysis and experiments on simulated and real-world data highlight their advantages over meta-analysis and related methods.
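The membership-weighted aggregation described above can be sketched in a few lines. The following is an illustrative simulation, not the paper's implementation: two synthetic sites with different treatment mechanisms, logistic local propensity models, and a membership classifier fitted on pooled data for brevity (in a real federated deployment, the membership model would itself be trained with a standard FL algorithm, and only model parameters would leave each site).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Simulate two sites with different treatment assignment mechanisms
# (illustrative data-generating process, not the paper's experimental setup).
def make_site(n, coef, shift):
    X = rng.normal(shift, 1.0, size=(n, 2))
    p = 1 / (1 + np.exp(-(X @ coef)))          # site-specific propensity
    T = rng.binomial(1, p)
    Y = X.sum(axis=1) + 2.0 * T + rng.normal(size=n)  # true ATE = 2
    return X, T, Y

sites = [make_site(500, np.array([1.5, -0.5]), 0.0),
         make_site(500, np.array([-1.0, 1.0]), 0.5)]

# Step 1: each site fits a local propensity model; only weights are shared.
local_models = [LogisticRegression().fit(X, T) for X, T, _ in sites]

# Step 2: fit a site-membership classifier giving P(site = k | x).
X_all = np.vstack([X for X, _, _ in sites])
site_id = np.concatenate([np.full(len(X), k)
                          for k, (X, _, _) in enumerate(sites)])
member = LogisticRegression().fit(X_all, site_id)

# Step 3: federated propensity score = membership-weighted average
# of the local scores, e(x) = sum_k P(site=k | x) * e_k(x).
T_all = np.concatenate([T for _, T, _ in sites])
Y_all = np.concatenate([Y for _, _, Y in sites])
mw = member.predict_proba(X_all)               # (n, K) membership weights
local_ps = np.column_stack([m.predict_proba(X_all)[:, 1]
                            for m in local_models])
e_hat = np.clip((mw * local_ps).sum(axis=1), 0.05, 0.95)

# Step 4: IPW estimate of the ATE with the aggregated scores.
ate = np.mean(T_all * Y_all / e_hat - (1 - T_all) * Y_all / (1 - e_hat))
print(round(ate, 2))
```

Note how the aggregation sidesteps per-site positivity: a unit poorly overlapped at its own site can still receive a usable pooled score when another site treats similar units.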




Deep Inductive Logic Programming meets Reinforcement Learning

Bueff, Andreas, Belle, Vaishak

arXiv.org Artificial Intelligence

One approach to explaining the hierarchical levels of understanding within a machine learning model is the symbolic method of inductive logic programming (ILP), which is data efficient and capable of learning first-order logic rules that can entail data behaviour. A differentiable extension of ILP, the so-called differentiable Neural Logic (dNL) network, is able to learn Boolean functions because its neural architecture includes symbolic reasoning. We propose an application of dNL in the field of Relational Reinforcement Learning (RRL) to address dynamic continuous environments. This extends previous work on applying dNL-based ILP in RRL settings, as our proposed model updates the architecture to enable it to solve problems in continuous RL environments. The goal of this research is to improve upon current ILP methods for use in RRL by incorporating non-linear continuous predicates, allowing RRL agents to reason and make decisions in dynamic and continuous environments.
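The core dNL building block, a differentiable conjunction neuron, and the idea of a continuous predicate can be illustrated in a few lines. This is a forward-pass sketch with hand-picked weights (the thresholds, slopes, and membership weights shown here are illustrative stand-ins for quantities the full model would learn), not the paper's architecture.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Differentiable conjunction neuron (after Payani & Fekri's dNL):
# each fuzzy input x_i in [0, 1] is gated by a trainable membership
# weight m_i = sigmoid(c_i); output = prod_i (1 - m_i * (1 - x_i)).
# A large m_i includes predicate i in the rule body; m_i near 0 ignores it.
def conj_neuron(x, c):
    m = sigmoid(c)
    return float(np.prod(1 - m * (1 - x)))

# A continuous predicate maps a raw state variable to a fuzzy truth value,
# e.g. "position > threshold" via a steep sigmoid.
def gt_predicate(value, threshold, slope=10.0):
    return sigmoid(slope * (value - threshold))

state = {"pos": 0.7, "vel": -0.2}
x = np.array([gt_predicate(state["pos"], 0.5),   # pos > 0.5 : mostly true
              gt_predicate(state["vel"], 0.0)])  # vel > 0.0 : mostly false
c = np.array([4.0, -4.0])   # "learned": use predicate 1, ignore predicate 2
out = conj_neuron(x, c)
print(round(out, 3))
```

Because the second predicate is effectively excluded (its membership weight is near zero), the neuron's output tracks only the first predicate's truth value, which is how the trained network extracts readable rules from its weights.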


Human Perception as a Phenomenon of Quantization

Aerts, Diederik, Arguëlles, Jonito Aerts

arXiv.org Artificial Intelligence

For two decades, the formalism of quantum mechanics has been successfully used to describe human decision processes, situations of heuristic reasoning, and the contextuality of concepts and their combinations. The phenomenon of 'categorical perception' has put us on track to find a possible deeper cause of the presence of this quantum structure in human cognition. Thus, we show that in an archetype of human perception consisting of the reconciliation of a bottom-up stimulus with a top-down cognitive expectation pattern, there arises the typical warping of categorical perception, where groups of stimuli clump together to form quanta, which move away from each other and lead to a discretization of a dimension. The individual concepts, which are these quanta, can be modeled by a quantum prototype theory with the square of the absolute value of a corresponding Schrödinger wave function as the fuzzy prototype structure, and the superposition of two such wave functions accounts for the interference pattern that occurs when these concepts are combined. Using a simple quantum measurement model, we analyze this archetype of human perception, provide an overview of the experimental evidence base for categorical perception with the phenomenon of warping leading to quantization, and illustrate our analyses with two examples worked out in detail.
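The superposition mechanism the abstract appeals to can be made concrete numerically. The sketch below uses two Gaussian wave packets as stand-in fuzzy prototype structures |psi|^2 (the packet parameters are illustrative, not fitted to any experiment) and shows that the density of an equal-weight superposition differs from the classical mixture by exactly the interference term Re(psi_a* psi_b).

```python
import numpy as np

# Two Gaussian wave packets as illustrative fuzzy prototype structures
# |psi|^2 for two concepts; centers and phases are arbitrary choices.
x = np.linspace(-5, 5, 1001)
dx = x[1] - x[0]

def packet(center, k):
    psi = np.exp(-(x - center) ** 2) * np.exp(1j * k * x)
    return psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)  # normalize density

psi_a = packet(-1.0, 2.0)
psi_b = packet(+1.0, -2.0)

# Equal-weight superposition modeling the combined concept.
psi_ab = (psi_a + psi_b) / np.sqrt(2)
p_ab = np.abs(psi_ab) ** 2

# A classical mixture would just average the two densities; the difference
# is the oscillating interference term Re(conj(psi_a) * psi_b).
mixture = 0.5 * (np.abs(psi_a) ** 2 + np.abs(psi_b) ** 2)
interference = p_ab - mixture
peak = float(np.max(np.abs(interference)))
print(round(peak, 3))   # nonzero: the superposition is not a mixture
```

The nonzero interference term is the mathematical signature that distinguishes the quantum model of concept combination from any classical probabilistic mixture of the two prototypes.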


Fuzzy Clustering with Similarity Queries

Huleihel, Wasim, Mazumdar, Arya, Pal, Soumyabrata

arXiv.org Machine Learning

The fuzzy or soft $k$-means objective is a popular generalization of the well-known $k$-means problem, extending the clustering capability of $k$-means to datasets that are uncertain, vague, and otherwise hard to cluster. In this paper, we propose a semi-supervised active clustering framework, where the learner is allowed to interact with an oracle (domain expert), asking for the similarity between a certain set of chosen items. We study the query and computational complexities of clustering in this framework. We prove that a few such similarity queries enable one to get a polynomial-time approximation algorithm for an otherwise conjecturally NP-hard problem. In particular, we provide probabilistic algorithms for fuzzy clustering in this setting that ask $O(\mathsf{poly}(k)\log n)$ similarity queries and run in polynomial time, where $n$ is the number of items. The fuzzy $k$-means objective is nonconvex, with $k$-means as a special case, and is equivalent to other generic nonconvex problems such as non-negative matrix factorization. The ubiquitous Lloyd-type algorithms (or expectation-maximization algorithms) can get stuck at local minima. Our results show that by making a few similarity queries, the problem becomes easier to solve. Finally, we test our algorithms over real-world datasets, showing their effectiveness in real-world applications.
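For reference, the Lloyd-type baseline the paper's query-based algorithms improve on is the classical fuzzy $c$-means alternating update for the objective $\sum_{ij} u_{ij}^m \|x_j - c_i\|^2$. The sketch below implements that standard baseline (not the paper's algorithm) on easy synthetic data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Classical fuzzy c-means: alternate between a weighted centroid update and
# the closed-form soft-membership update, for the nonconvex objective
# sum_ij u_ij^m ||x_j - c_i||^2 with fuzzifier m > 1.
def fuzzy_kmeans(X, k, m=2.0, iters=100):
    n = len(X)
    U = rng.dirichlet(np.ones(k), size=n)      # soft memberships, rows sum to 1
    for _ in range(iters):
        W = U ** m
        C = (W.T @ X) / W.sum(axis=0)[:, None] # membership-weighted centroids
        d = np.linalg.norm(X[:, None] - C[None], axis=2) + 1e-12
        U = d ** (-2 / (m - 1))                # u_ij ∝ d_ij^{-2/(m-1)}
        U /= U.sum(axis=1, keepdims=True)
    return U, C

# Two well-separated clusters; a friendly instance for the baseline.
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
U, C = fuzzy_kmeans(X, k=2)
centers = np.sort(C[:, 0])
print(np.round(centers, 1))   # first coordinates near 0 and 3
```

On hard instances this iteration can stall at a poor local minimum regardless of how long it runs; that is precisely the failure mode the $O(\mathsf{poly}(k)\log n)$ similarity queries are shown to circumvent.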


Jointly Discriminative and Generative Recurrent Neural Networks for Learning from fMRI

Dvornek, Nicha C., Li, Xiaoxiao, Zhuang, Juntang, Duncan, James S.

arXiv.org Machine Learning

Recurrent neural networks (RNNs) were designed for dealing with time-series data and have recently been used for creating predictive models from functional magnetic resonance imaging (fMRI) data. However, gathering large fMRI datasets for learning is a difficult task. Furthermore, network interpretability is unclear. To address these issues, we utilize multitask learning and design a novel RNN-based model that learns to discriminate between classes while simultaneously learning to generate the fMRI time-series data. Employing the long short-term memory (LSTM) structure, we develop a discriminative model based on the hidden state and a generative model based on the cell state. The addition of the generative model constrains the network to learn functional communities represented by the LSTM nodes that are both consistent with the data generation and useful for the classification task. We apply our approach to the classification of subjects with autism vs. healthy controls using several datasets from the Autism Brain Imaging Data Exchange. Experiments show that our jointly discriminative and generative model improves classification learning while also producing robust and meaningful functional communities for better model understanding.
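The two-head structure (discriminative read-out on the hidden state, generative read-out on the cell state) can be sketched with a minimal NumPy LSTM cell. This is a forward pass with random weights and random data to show where each loss attaches; the dimensions, the loss weighting, and the random series are all illustrative, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Minimal LSTM cell with two read-outs: a classification head on the hidden
# state h_t and a generative head on the cell state c_t predicting x_{t+1}.
d_in, d_hid = 4, 8
Wx = rng.normal(0, 0.1, (4 * d_hid, d_in))   # input weights for i, f, o, g
Wh = rng.normal(0, 0.1, (4 * d_hid, d_hid))  # recurrent weights
b = np.zeros(4 * d_hid)
W_cls = rng.normal(0, 0.1, (1, d_hid))       # discriminative head (on h_t)
W_gen = rng.normal(0, 0.1, (d_in, d_hid))    # generative head (on c_t)

def step(x, h, c):
    z = Wx @ x + Wh @ h + b
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

T = 20
series = rng.normal(size=(T, d_in))          # stand-in for one subject's ROI series
h = c = np.zeros(d_hid)
gen_loss = 0.0
for t in range(T - 1):
    h, c = step(series[t], h, c)
    x_hat = W_gen @ c                        # generate prediction of x_{t+1}
    gen_loss += np.mean((x_hat - series[t + 1]) ** 2)
p_class = float(sigmoid(W_cls @ h)[0])       # class probability from final h_T

# Multitask objective: classification loss plus a weighted generative loss
# (the 0.5 weight is an arbitrary illustrative choice).
cls_loss = -np.log(p_class)                  # assume true label = 1
total = cls_loss + 0.5 * gen_loss / (T - 1)
print(round(total, 3))
```

Training would backpropagate the combined loss through both heads, so the cell states must stay predictive of the signal itself, which is what regularizes the learned functional communities.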


On the EM-Tau algorithm: a new EM-style algorithm with partial E-steps

Fajardo, Val Andrei, Liang, Jiaxi

arXiv.org Machine Learning

The EM algorithm is one of many important tools in the field of statistics. While often used for imputing missing data, its widespread applications include other common statistical tasks, such as clustering. In clustering, the EM algorithm assumes a parametric distribution for the clusters, whose parameters are estimated through an iterative procedure based on the theory of maximum likelihood. However, one major drawback of the EM algorithm, which renders it impractical especially when working with large datasets, is that it often requires several passes over the data before convergence. In this paper, we introduce a new EM-style algorithm that implements a novel policy for performing partial E-steps. We call the new algorithm the EM-Tau algorithm, which can approximate the traditional EM algorithm with high accuracy but with only a fraction of the running time.
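The idea of a partial E-step can be demonstrated on a tiny Gaussian mixture: refresh responsibilities for only a subset of points each iteration, then run a full M-step over all (stale and fresh) responsibilities. The random-subset policy below is an illustrative stand-in; EM-Tau's actual selection policy differs.

```python
import numpy as np

rng = np.random.default_rng(2)

# 1-D two-component Gaussian mixture fitted with an EM variant that updates
# responsibilities for only 20% of the points per iteration (illustrative
# partial E-step in the spirit of EM-Tau, not its exact policy).
X = np.concatenate([rng.normal(-2, 1, 400), rng.normal(3, 1, 600)])
n, k = len(X), 2
mu = np.array([-1.0, 1.0])
var = np.ones(k)
pi = np.full(k, 0.5)
R = np.full((n, k), 0.5)                     # responsibilities, refreshed lazily

def normal_pdf(x, m, v):
    return np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

for _ in range(100):
    # Partial E-step: recompute responsibilities on a random 20% subset.
    idx = rng.choice(n, size=n // 5, replace=False)
    dens = pi * normal_pdf(X[idx, None], mu, var)
    R[idx] = dens / dens.sum(axis=1, keepdims=True)
    # Full M-step over all responsibilities, stale ones included.
    Nk = R.sum(axis=0)
    mu = (R * X[:, None]).sum(axis=0) / Nk
    var = (R * (X[:, None] - mu) ** 2).sum(axis=0) / Nk
    pi = Nk / n

means = np.sort(mu)
print(np.round(means, 1))   # component means near -2 and 3
```

Each E-step here touches a fifth of the data, so a pass-equivalent amount of E-step work is spread over five iterations, which is the source of the running-time savings on large datasets.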


Context and Interference Effects in the Combinations of Natural Concepts

Aerts, Diederik, Arguëlles, Jonito Aerts, Beltran, Lester, Beltran, Lyneth, de Bianchi, Massimiliano Sassoli, Sozzo, Sandro, Veloz, Tomas

arXiv.org Artificial Intelligence

Philosophers and psychologists have always been interested in the deep nature of human concepts, how they are formed, how they combine to create more complex conceptual structures, as expressed by sentences and texts, and how meaning is created in these processes. Unveiling aspects of these mysteries is bound to have a massive impact on a variety of domains, from knowledge representation to natural language processing, machine learning and artificial intelligence. The original idea of a concept as a 'container of objects', called 'instantiations', which can be traced back to Aristotle, was challenged by the first cognitive tests by Eleanor Rosch, which revealed that concepts exhibit aspects, like 'context-dependence', 'vagueness' and 'graded typicality', that prevent a too naïve definition of a concept as a 'set of defining properties that are either possessed or not possessed by individual exemplars' [1, 2]. Moreover, these tests raised the suspicion that concepts do not combine by following the algebraic rules of classical logic. A first attempt to preserve a set-theoretical modeling came from the 'fuzzy set approach': concepts would be represented by fuzzy sets, while their conjunction (disjunction) satisfies the 'minimum (maximum) rule of fuzzy set conjunction (disjunction)' [3]. However, this approach too was refuted by a whole set of experiments by cognitive psychologists, including Osherson and Smith, who identified the 'Guppy effect' (or 'Pet-Fish problem') in typicality judgments [4], James Hampton, who discovered 'overextension' and 'underextension' effects in membership judgments [5, 6], and Alxatib and Pelletier, who detected 'borderline contradictions' in simple propositions of the form "John is tall and John is not tall" [7]. More recently, some of us proved that these data violate Kolmogorov's axioms of classical probability theory [8], thus revealing that classical structures,


Quantum Structure in Cognition, Origins, Developments, Successes and Expectations

Aerts, Diederik, Sozzo, Sandro

arXiv.org Artificial Intelligence

We provide an overview of the results we have attained in the last decade on the identification of quantum structures in cognition and, more specifically, in the formalization and representation of natural concepts. We first discuss the quantum foundational reasons that led us to investigate the mechanisms of formation and combination of concepts in human reasoning, starting from the empirically observed deviations from classical logical and probabilistic structures. We then develop our quantum-theoretic perspective in Fock space, which allows successful modeling of various sets of cognitive experiments collected by different scientists, including ourselves. In addition, we formulate a unified explanatory hypothesis for the presence of quantum structures in cognitive processes, and discuss our recent discovery of further quantum aspects in concept combinations, namely, 'entanglement' and 'indistinguishability'. We finally illustrate perspectives for future research.